Adaptive Redundancy Removal in 
Data Transmission 

By R. W. LUCKY 

This paper suggests an adaptive filter, similar to that used in automatic
equalization, for use as a predictor in data compression systems. It
discusses some of the applications of this adaptive predictor in digital
data transmission. In the event of redundant data input to the system the
predictor could be used to lower the transmitted power output required for
a given error rate or to decrease the error rate while maintaining constant
transmitted power. The action of these redundancy-removal and restoration
systems is analyzed in simple cases involving Markov inputs.

I. INTRODUCTION 

In the design, analysis, and testing of data transmission systems it 
is invariably assumed that the input digits are identically distributed, 
independent random variables. However, in many actual systems the 
input digits may arise from a physical source which imposes signifi- 
cant correlations in the data train. In these cases we know that the 
entropy of the source is less than when independent digits are pre- 
sented. Accordingly, we should be able to use the redundancy in the 
input message to provide, in some sense, more efficient transmission. 
For example, we could imagine the redundancy being used to decrease
bandwidth, to increase speed, to lower probability of error, or
to lower average signal power. 

Redundancy removal in analog transmission systems was investigated
in the early 1950's by Oliver, Kretzmer, Harrison, and Elias [1-4]. Each
of these papers relied on the theory of linear prediction as developed
by Wiener in the early 1940's [5]. Figure 1 shows the basic idea. It is
assumed that the input samples are taken from a stationary time series
$\{x_n\}$. These samples are passed through a linear filter whose output
$\hat{x}_n$ at time $t_n$ forms a linear prediction of the sample $x_n$
based on all preceding samples. The prediction $\hat{x}_n$ is subtracted
from the actual sample $x_n$ and only the error $e_n$ is passed on for
further processing and transmission. Since the portion $\{\hat{x}_n\}$
"removed" from the input sequence is a deterministic function of the error
sequence, no information has been lost and the original sequence can be
reconstructed at the receiver by the feedback loop shown in the figure.

Fig. 1. Predictive system.

The philosophy of predictive systems has been widely studied for
its application in bandwidth compression of telemetry data and of
television; for example, see Kortman, Davisson, and O'Neal [6-8]. In these
examples the error samples $e_k$ are quantized and transmitted by PCM.
Because of redundancy, that is, predictability, in the source data,
fewer digits per sample (and consequently less bandwidth) are required
for transmitting the error samples than for transmitting the
original samples for a given fidelity of reconstruction.

One of the difficulties with these data compression systems is in 
determining the predictor filter. Although the theory of linear predic- 
tion for stationary time series is well known, the practical determina- 
tion of the statistical properties of the input data and the realization 
of the corresponding optimum filter are nearly impossible. Generally, 
an approximate average statistical description is used for the input 
data and a considerably simplified version of the optimum filter is 
constructed. Most existing compression schemes appear to use only 
linear or zero-order extrapolation of the previous sample to form the 
prediction of the succeeding sample. More complicated and adaptive 
prediction techniques have been confined to computer-processed data. 

In this paper we describe a simply-instrumented adaptive filter for 
use as a predictor. This filter uses a finite tapped delay line whose 
coefficients are continually adjusted to provide a least squares
prediction of incoming data. The coefficient settings are based on the
statistics of a finite section of the past data (the learning period). As
the statistics of the data during this learning period change, the 
coefficients are changed to provide an updated version of the predictor 
filter. 

Although the most obvious applications of this adaptive predictor 
would be in the transmission of television or some other very redun- 
dant analog signal, we choose here to explore its application in digital 
data transmission. In the past, little attention seems to have been 
focused on the use of prediction in digital transmission. Presumably 
this is because the most effective use of prediction would be in the 
compression of the analog wave from which the digits are taken. 

However, there do exist situations in which the input digital signal 
is not under the control of the transmission systems designer. This 
occurs notably in the design of data communications equipment. 
Although it has been common practice to use redundancy in speech 
signals to ease transmission system requirements (the TASI system 
is a dramatic example), nothing similar has been attempted with 
digital data signals. There would seem to be no compelling reason 
why any redundancy in digital signals should not be taken advantage 
of, as long as the error statistics of the output data were not ad- 
versely affected by the procedure. After describing a digital redund- 
ancy removal and restoration system we shall discuss its possible 
benefits to the customer and to the transmission plant. 

II. SYSTEM DESCRIPTION 

Figure 2 shows a digital redundancy removal and restoration scheme.
For simplicity we assume that the input digits $a_n$ are binary, although
the technique obviously extends to multilevel transmission. The input
sequence is passed through a shift-register transversal filter whose tap
gains $c_k$ have been adjusted so that the filter output $\hat{a}_n$, where

$$\hat{a}_n = \sum_{k=1}^{N} c_k a_{n-k}, \tag{1}$$

is a linear least squares prediction of $a_n$. This prediction is subtracted
from the actual sample $a_n$ and only the difference $e_n$ is passed to the
modulator for transmission. Notice that, although $a_n$ is a binary variable
taking on the values $\pm 1$, both $\hat{a}_n$ and $e_n$ are analog. Unless
the digits $a_n$ are uncorrelated, the error samples $e_n$ will have smaller
variance than the unit variance of the input data. Consequently, a linear
modulator will put out less line power in transmitting the error samples
than in transmitting the original data.

Fig. 2. Digital redundancy removal and restoration.
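
As a rough numerical illustration of the power saving described above, the following Python sketch passes a correlated $\pm 1$ sequence through a fixed one-tap predictor and compares the mean-square value of the error samples with that of the input. The first-order Markov test source, the correlation value 0.6, the fixed tap gain, and all function names are assumptions chosen for illustration; they are not taken from the system of Figure 2.

```python
import random

def markov_bits(n, r, rng):
    """Return n symbols in {+1, -1}; each repeats its predecessor with probability (1 + r) / 2."""
    a = [1 if rng.random() < 0.5 else -1]
    for _ in range(n - 1):
        same = rng.random() < (1.0 + r) / 2.0
        a.append(a[-1] if same else -a[-1])
    return a

def prediction_error(a, taps):
    """Form e_n = a_n - sum_k c_k a_{n-k}, taking symbols before the start as zero."""
    e = []
    for n in range(len(a)):
        pred = sum(c * a[n - k] for k, c in enumerate(taps, start=1) if n - k >= 0)
        e.append(a[n] - pred)
    return e

def mean_square(x):
    return sum(v * v for v in x) / len(x)

rng = random.Random(0)
a = markov_bits(50_000, 0.6, rng)      # hypothetical redundant source
e = prediction_error(a, [0.6])         # one tap, gain equal to the source correlation
input_power = mean_square(a)           # unit power for +/-1 symbols
error_power = mean_square(e)           # near 1 - 0.6**2 = 0.64 for this source
```

With uncorrelated input (correlation zero) the same code gives an error power no smaller than the input power, which is the case in which nothing is gained.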

After demodulation at the receiver, the missing, predictable component
$\hat{a}_n$ must be added to the error sample $e_n$ before slicing, in order
to recover $a_n$. This component is obtained by a bootstrap arrangement
wherein the detected symbols are passed through a transversal filter
identical to that at the transmitter in order to form the predictions
$\hat{a}_n$. The receiver is similar in arrangement to the circuitry used
in dc restoration.
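
The bootstrap arrangement can be sketched in a few lines of Python. The sketch assumes a noiseless channel, fixed and identical tap gains at both ends, and shift registers that start empty; the tap values and function names are illustrative assumptions only.

```python
def transmit(a, taps):
    """Error samples e_n = a_n - sum_k c_k a_{n-k} (shift register assumed to start at zero)."""
    e = []
    for n in range(len(a)):
        pred = sum(c * a[n - k] for k, c in enumerate(taps, start=1) if n - k >= 0)
        e.append(a[n] - pred)
    return e

def receive(e, taps):
    """Rebuild the symbols by adding the locally formed prediction back before slicing."""
    detected = []
    for n in range(len(e)):
        # The prediction is formed from the receiver's own past decisions.
        pred = sum(c * detected[n - k] for k, c in enumerate(taps, start=1) if n - k >= 0)
        detected.append(1 if e[n] + pred >= 0 else -1)
    return detected

taps = [0.5, -0.25, 0.125]                       # illustrative tap gains, not from the paper
a = [1, 1, -1, 1, -1, -1, -1, 1, 1, -1, 1, 1]
recovered = receive(transmit(a, taps), taps)     # equals a when no noise is added
```

Because the receiver's prediction depends on its own earlier decisions, a single wrong decision would perturb later predictions; this is the error propagation taken up in Section V.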

There are two relatively simple ways in which this system could 
be used to improve transmission efficiency. As shown in Figure 2 the 
system lowers the average transmitted power without appreciably 
affecting the output data error rate. In this mode of operation any 
benefit from the data redundancy is used to lower the load require- 
ments on the transmission plant. If many data sets were equipped 
with such circuitry, the average power handled by the plant would 
be lowered in a statistical fashion. Some sets, transmitting entirely 
random data, would require their normal power complement. Others, 
transmitting redundant data, would require considerably less. Notice 
that this is exactly the type of effect which now takes place for voice 
transmission. 

As the input data becomes entirely redundant in the limit, the 
transmitted power goes to zero. In this case the input data consists 
of a periodic pattern. In spite of the zero-level line signal, the pattern
is reconstructed exactly at the receiver (in the absence of noise).
Such an eventuality would alleviate the problems now encountered 
with the transmission of periodic data. These data patterns normally 
lead to tones, that is, line spectra, in the transmission channel which 
cause certain overloading and other system malfunctions. 

Currently the problem is being treated in wideband transmission 
by the introduction of digital scramblers. In practice the zero- 
level transmitted signal would not be a satisfactory solution to the 
tone problem since some signal strength would be required for syn- 
chronizing and timing maintenance. However, proper design of the 
system could ensure that some minimum signal strength was main- 
tained under all circumstances. For example, a nonlinear element in 
each predictor could be used to keep the predictions smaller than 
unity. As long as the same nonlinearity were used in both transmitter 
and receiver, the data signal would be reconstructed perfectly at the 
receiver. 

The other simple way to use redundancy removal to aid transmis- 
sion would be to keep the level of transmitted power constant while 
lowering the probability of error. In this case, compensating gain 
controls would be placed at the transmitter output and at the re- 
ceiver input. These controls would be adjusted to keep the transmitted 
power constant regardless of signal redundancy. During periods of 
redundancy most of the voltage presented to the slicer at the receiver
would come via the feedback predictor and therefore would be noise- 
less (in the absence of errors). Since the small error signal transmitted 
would be greatly amplified to keep line power constant, the total noise 
presented to the slicer after complementary deamplification would be 
much smaller than in normal transmission. Consequently, the error 
rate would be diminished during periods of redundant data trans- 
mission. 
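
The noise reduction in this mode can be made explicit with a short calculation. It is a sketch under assumptions not stated in the text: the error samples have mean-square value $P < 1$, the transmitter gain is $g = 1/\sqrt{P}$ so that the line power is unity, the receiver applies the complementary gain $1/g$, and the channel adds noise samples $\xi_n$ of variance $\sigma^2$.

```latex
\begin{align*}
\text{line signal:}\quad & g\,e_n, \qquad E[(g\,e_n)^2] = g^2 P = 1, \\
\text{after deamplification:}\quad & \tfrac{1}{g}\,(g\,e_n + \xi_n) = e_n + \sqrt{P}\,\xi_n, \\
\text{slicer input:}\quad & e_n + \hat{a}_n + \sqrt{P}\,\xi_n = a_n + \sqrt{P}\,\xi_n .
\end{align*}
```

Under these assumptions the noise variance at the slicer is $P\sigma^2$ rather than $\sigma^2$, provided the fed-back prediction is formed from correct decisions.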

Complementary amplification and deamplification surrounding chan- 
nel noise introduction are automatically accomplished in transmission 
over compandored facilities. Normally for these channels we would 
expect that the error rate would be independent of transmitted power 
level. In the redundancy removal system, however, this mechanism 
is defeated by using the noiseless feedback in the detection process. 

There are further uses of redundancy removal in data transmission,
but they appear to involve more complicated system arrangements.
For example, the bit rate and bandwidth of the data signal could be
lowered for redundant data. This could be accomplished by slicing
the prediction $\hat{a}_n$ to obtain a closest digital prediction and then
subtracting it from $a_n$ in digital form. The resulting error digits could
then be processed by run-length encoding to achieve message compression.
Of course we would then need a buffer to ensure a constant
channel bit rate. We will not discuss this type of system further here.

Thus far we have alluded to the possible benefits of redundancy
removal in data transmission. There is also one major drawback,
that of error propagation. Since the estimate $\hat{a}_n$ at the receiver
depends on the correct reception of all previous data, the compensation
at the receiver is perfect only in the absence of errors. When an error
occurs, the probability of error in succeeding bits tends to be larger
and an error propagating effect occurs. Notice that this effect does
not depend on the particular circuit configuration for its existence,
but is a philosophical necessity in any redundancy removal operation.
We analyze the effect of error propagation in a simple example in
Section V. Normally we would not expect the error propagation to
increase the entire error rate by more than a small algebraic factor.

III. THE ADAPTIVE PREDICTION FILTER 

In the theory of linear prediction developed by Wiener [5] and others
it is assumed that the input samples $a_n$ are taken from a stationary
time series with known covariance function $R(n)$, where

$$E[a_m a_n] = R(m - n). \tag{2}$$

The power output, which is the mean square prediction error, is

$$P = E[e_n^2] = E\left[\left(a_n - \sum_{k=1}^{N} c_k a_{n-k}\right)^2\right]. \tag{3}$$

The coefficients $c_k$; $k = 1, \ldots, N$, which minimize this prediction
error, can be obtained by the solution of the $N$ simultaneous equations

$$\sum_{k=1}^{N} c_k R(n - k) = R(n); \quad n = 1, 2, \ldots, N. \tag{4}$$

In case of an infinite filter ($N = \infty$) the coefficients $c_k$ and the
prediction error are given by a method involving factoring of the
spectral density $G(f)$ of the input process. Under proper conditions
the prediction error $P$ can be expressed in the form

$$P = \exp\left[\int_{-1/2}^{1/2} \log G(f)\, df\right]. \tag{5}$$

(See Doob for the mathematical niceties of this result [10].) Notice that
if the input symbols are independent, $G(f) = 1$, $|f| < 1/2$, and
$P = 1$. Since the input power is also unity no gain is achieved by
the prediction process. If, on the other hand, $G(f)$ is not flat the
prediction error $P$ is less than unity and power is saved.
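
The finite-tap equations and the infinite-filter formula can be checked against each other numerically. The Python sketch below assumes a covariance $R(n) = \rho^{|n|}$ with $\rho = 0.6$ (a first-order Markov source of the kind analyzed later), solves the $N = 3$ normal equations by Gaussian elimination, and evaluates the integral of $\log G(f)$ over $|f| < 1/2$ by the midpoint rule, using the spectral density $G(f) = (1 - \rho^2)/(1 - 2\rho\cos 2\pi f + \rho^2)$ that corresponds to this covariance. The helper names and numerical values are illustrative assumptions.

```python
import math

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

rho, N = 0.6, 3
R = lambda n: rho ** abs(n)                        # assumed covariance function

# Normal equations: sum_k c_k R(n - k) = R(n), n = 1..N
A = [[R(n - k) for k in range(1, N + 1)] for n in range(1, N + 1)]
c = solve(A, [R(n) for n in range(1, N + 1)])

# Mean-square error of the optimum N-tap predictor: R(0) - sum_k c_k R(k)
p_finite = R(0) - sum(ck * R(k) for k, ck in enumerate(c, start=1))

# Infinite-filter formula, integral of log G(f) by the midpoint rule over |f| < 1/2
G = lambda f: (1 - rho**2) / (1 - 2 * rho * math.cos(2 * math.pi * f) + rho**2)
M = 20_000
p_infinite = math.exp(sum(math.log(G(-0.5 + (i + 0.5) / M)) for i in range(M)) / M)
```

For this source only the first tap is nonzero, and both `p_finite` and `p_infinite` come out at $1 - \rho^2$, so extra taps buy nothing.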

While the mathematics of linear prediction for stationary time 
series serve as a guide to actual system performance, it is clear that 
the assumptions are philosophically inadmissible. Furthermore, since 
the data source is outside the designer's control, it would be extremely 
unlikely that the covariance function would be known in advance. 
For these reasons, Balakrishnan [11] in 1961 developed a mathematical
formulation for a learning or adaptive predictor wherein the form of
the prediction operator was dependent solely on the past data and
not on any assumptions of stationarity or of prior knowledge of data
statistics.

In Balakrishnan's formulation that prediction operator is chosen
as optimum at time $t_n$ which works best when applied at times
$t_{n-1}, \ldots, t_{n-L}$. Since all past information is available, we could
"try out" all possible prediction operators on the previous data and select
the operator for which

$$E_n = \sum_{j=1}^{L} \left[\hat{a}_{n-j} - a_{n-j}\right]^2 w_j \tag{6}$$

is minimum. The weights $w_j$ could be used to assign a relative
importance to each past trial of the predictor.

For our finite linear predictor we have

$$E_n = \sum_{j=1}^{L} \left[a_{n-j} - \sum_{k=1}^{N} c_k a_{n-j-k}\right]^2 w_j. \tag{7}$$



In order to develop a physical implementation for this adaptive filter
we use a motivation based on a steepest descent approach. The derivatives
of the error $E_n$ with respect to the coefficients $c_m$ are

$$\frac{\partial E_n}{\partial c_m} = -\sum_{j=1}^{L} 2 w_j \left[a_{n-j} - \sum_{k=1}^{N} c_k a_{n-j-k}\right] a_{n-j-m} \tag{8}$$

$$\frac{\partial E_n}{\partial c_m} = -\sum_{j=1}^{L} 2 w_j e_{n-j} a_{n-j-m}. \tag{9}$$

Notice that these derivatives can be obtained by passing the product
of sample $a_{n-m}$ and the error voltage $e_n$ through a filter with impulse
response $\{w_j\}$. Thus we are led to the adaptive filter configuration
shown in Figure 3. This configuration is entirely similar to that currently
being used for equalization [12] and for echo suppression [13,14].

When the input samples $a_n$ are digital, the circuitry of Figure 3 is
quite simple. The delay line becomes a shift register and the multipliers
become simple polarity switches. However, the circuit is not
limited to digital applications, but could be used in such analog
functions as telemetry or television compression systems.
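
A discrete-time caricature of this loop is sketched below in Python. It assumes the smoothing filter is reduced to a pure accumulator, so that each tap gain is incremented by a small step times the product $e_n a_{n-m}$, which moves each coefficient against the gradient sample by sample. The step size `mu`, the two-tap length, and the Markov test source with correlation 0.6 are illustrative assumptions, not design values from the paper.

```python
import random

def adapt(a, num_taps, mu):
    """Run the correlate-and-accumulate loop; return the tap-gain history."""
    c = [0.0] * num_taps
    history = []
    for n in range(num_taps, len(a)):
        past = [a[n - m] for m in range(1, num_taps + 1)]     # a_{n-1}, ..., a_{n-N}
        e = a[n] - sum(cm * am for cm, am in zip(c, past))    # error sample e_n
        c = [cm + mu * e * am for cm, am in zip(c, past)]     # polarity switch + integrator
        history.append(c[:])
    return history

rng = random.Random(1)
r = 0.6                                   # assumed source correlation
a = [1]
for _ in range(40_000):
    a.append(a[-1] if rng.random() < (1 + r) / 2 else -a[-1])

history = adapt(a, num_taps=2, mu=0.002)
tail = history[-5000:]
c_avg = [sum(h[m] for h in tail) / len(tail) for m in range(2)]   # settles near [r, 0]
```

A smaller `mu` gives steadier tap gains at the cost of a longer settling time, which is the accuracy versus response-time trade governed by the smoothing filters.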

In any event, the response of the system, involving accuracy and
settling time as well as stability, is controlled by selection of the
smoothing filters $W(\omega)$. Basically these filters must perform an
averaging followed by an integration. If the data were stationary and
the memory $L$ sufficiently long, the result of averaging the product
of the error and sample voltages for the $m$th tap coefficient would
give (see equation 8)

$$y_m(t) \approx E[a_{n-m} e_n] = R(m) - \sum_{k=1}^{N} c_k(t) R(m - k). \tag{10}$$

Then these voltages would be integrated for use as tap coefficients,
so that the governing system equations would be

$$\dot{c}_m(t) = A\left[R(m) - \sum_{k=1}^{N} c_k(t) R(m - k)\right] \quad \text{for } m = 1, \ldots, N. \tag{11}$$

This system would be stable for all $A > 0$, since the covariance matrix,
whose $nm$th entry is $R(n - m)$, must be positive definite (see Davenport
and Root [15]). All voltages $y_m(t)$ would be asymptotically reduced to
zero and the filter coefficients would asymptotically approach those
of the optimum (least squares) linear predictor of equation (4).

Fig. 3. Adaptive prediction filter.

For nonstationary data and realistic filters $W(\omega)$ the analysis of
the nonlinear, multidimensional control system is extremely compli- 
cated. Let us study the dynamics of the one-dimensional system 
formed by using a one-tap predictor as a guide to the behavior of 
the system. 

In order to put this analysis into proper perspective with regard 
to the system of Figure 2 we should observe that when the input data 
statistics change abruptly, both transmitter and receiver predictors 
undergo the same transients. If the predictors are identical, these 
transients cancel exactly at the receiver summer and no loss in noise 
margin is suffered. However, the statistics of the transmitted signal 
are affected by only the transmitter predictor. Therefore, the proper 
design of the adaptive predictor is crucial to obtaining desirable line 
power statistics, but not to the performance of the entire system. 

IV. THE ONE-TAP TRANSMITTER FOR BINARY DATA 

Figure 4 shows a one-tap transmitter with a binary input signal
of the form

$$s(t) = \sum_{n} a_n r(t - nT), \qquad r(t) = \begin{cases} 1, & 0 \le t < T \\ 0, & \text{elsewhere.} \end{cases} \tag{12}$$

The transmitted voltage is given by

$$e(t) = s(t) - c(t)\, s(t - T) \tag{13}$$

where

$$c(t) = A\, w(t) * [s(t - T)\, e(t)]. \tag{14}$$

Because of the binary nature of the input, $s^2(t) = 1$ and thus

$$c(t) = A\, w(t) * [s(t)\, s(t - T) - c(t)]. \tag{15}$$

Let $m(t) = s(t)\, s(t - T)$; then the Laplace transform solution for
$C(s)$ is*

$$C(s) = \frac{A W(s) M(s)}{1 + A W(s)}. \tag{16}$$

* Some liberty has been taken with the shift-register starting state.

Now returning to equation (13) we multiply both sides by $s(t - T)$
to obtain

$$e(t)\, s(t - T) = m(t) - c(t). \tag{17}$$

Combining equations (16) and (17) gives

$$e(t)\, s(t - T) = m(t) * h(t) \tag{18}$$

where

$$H(s) = \frac{1}{1 + A W(s)}. \tag{19}$$

The output signal itself can be written by again multiplying equation
(18) by $s(t - T)$

$$e(t) = s(t - T)\,[m(t) * h(t)]. \tag{20}$$

Notice that the special properties of binary sequences have been used
in arriving at this solution, so that equation (20) does not hold for
multilevel or analog input.

Figure 5(a) shows the mathematically equivalent transmitter
given by equation (20) as well as its corresponding receiver. Since
the second multiplier does not affect the transmitted power in any
way, both transmitter and receiver can be simplified by its removal
to result in the equivalent represented by Figure 5(b).* This final
equivalent system is amazingly simple and appears to bear little
resemblance to the initial system of Figure 4. It is interesting to
observe that, while the initial system was termed "adaptive," no
one would seriously consider its equivalent in Figure 5(b) as being
adaptive in any sense.

* The systems differ in their noise performance, however.

Figure 5(b) has an intriguing interpretation. The input data is
first subjected to the nonlinear operation of delay and multiplication.
The output of the multiplier is

$$m(t) = \sum_{n} a_n a_{n-1} r(t - nT). \tag{21}$$

This voltage has a mean value given by $R(1)$ in the stationary case.
If the filter $W(\omega)$ has been designed as a low pass filter, then the
filter $1/[1 + A W(\omega)]$ in the equivalent circuit is a high pass filter.
Thus the dc component of $m(t)$ is removed before transmission and
reinserted via a dc restorer at the receiver. In other words, a nonlinear
operation on the input signal has converted the correlation into a
spectral line which can then be removed by a time invariant linear
filter. It would seem that some generalization of this concept should
be possible, but as yet none has been found.

The equivalent circuit can be used for design purposes in selecting
$W(\omega)$, or for calculating line power or transient response. Here are
the results of a few straightforward examples.

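Example 1

Simple RC filter, dotting pattern input applied at time zero:

$$W(s) = \frac{\alpha}{s + \alpha}; \quad \alpha = \frac{1}{RC}; \qquad a_n = \begin{cases} +1, & n \text{ even} \\ -1, & n \text{ odd.} \end{cases} \tag{22}$$

A deterministic sequence is to be transmitted. We find that the output
of the equivalent circuit is

$$e(t)\, s(t - T) = -\left[\frac{1}{1 + A}\, u(t) + \frac{A}{1 + A}\, \exp[-\alpha(1 + A)t]\right]. \tag{23}$$

Thus the error voltage transmitted in the original circuit becomes

$$e(t) = \left[\sum_{n=0}^{\infty} (-1)^n r(t - nT)\right] \left[\frac{1}{1 + A}\, u(t) + \frac{A}{1 + A}\, \exp[-\alpha(1 + A)t]\right]. \tag{24}$$

The error voltage does not approach zero because of the lack of an
integration in the smoothing filter.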
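
The residual level can be read off numerically. The short Python sketch below evaluates the envelope $1/(1+A) + [A/(1+A)]\exp[-\alpha(1+A)t]$ that multiplies the alternating pattern in this example; the gain and time-constant values and the function name are illustrative assumptions.

```python
import math

def error_envelope(t, gain, alpha):
    """Magnitude of e(t) for t >= 0 in Example 1 (the +/-1 pattern factor is dropped)."""
    return 1.0 / (1.0 + gain) + gain / (1.0 + gain) * math.exp(-alpha * (1.0 + gain) * t)

gain, alpha = 9.0, 0.5                         # illustrative loop gain A and alpha = 1/RC
start = error_envelope(0.0, gain, alpha)       # full amplitude at switch-on
later = error_envelope(10.0, gain, alpha)      # settles to the floor 1 / (1 + A)
```

With these values the envelope starts at unity and decays to 0.1; raising the gain lowers the floor but never removes it, which is the missing integration noted above.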

Example 2 

Simple RC filter, Markov input:

If the input is a first order Markov process the one-tap predictor
becomes the optimum linear predictor. (We study this case more thoroughly
in the next section.) The covariance function of the input time
series is taken to be

$$R(n) = R^{|n|}. \tag{25}$$

Since we now are dealing with a random input, our concern is with
the transmitted power level rather than the exact waveform as in
the previous example. The transmitted power is the same in Figures
4 and 5(b), so we use the simpler structure of the latter diagram for
analysis.

When the input Markov process is subjected to delay and multiplication,
it can be shown that the resultant symbols $(a_n a_{n-1})$ have
mean value $R$ and are uncorrelated. The spectral density of the
multiplier output $m(t)$ is given by

$$S_m(\omega) = R^2 \delta(\omega) + (1 - R^2)\, T \left[\frac{\sin(\omega T/2)}{\omega T/2}\right]^2. \tag{26}$$

This spectral density can be multiplied by $|H(\omega)|^2$ and integrated to
give the transmitted power. The power becomes

$$P = \frac{R^2}{(1 + A)^2} + (1 - R^2)\left\{\frac{1}{(1 + A)^2} + \left[1 - \frac{1}{(1 + A)^2}\right]\left[\frac{1 - \exp[-\alpha(1 + A)T]}{\alpha(1 + A)T}\right]\right\}. \tag{27}$$

Ideally, of course, this power should be $(1 - R^2)$, but the crude RC
filter is unable to approximate this result unless the gain is high
and the time constant $(1/\alpha)$ is large.
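
The dependence on gain and time constant can be explored with a few lines of Python. The sketch takes the power of this example in the form $P = R^2 q + (1 - R^2)[q + (1 - q)g]$, with $q = 1/(1+A)^2$ and $g = \{1 - \exp[-\alpha(1+A)T]\}/[\alpha(1+A)T]$; this is an assumed reading of equation (27), and the parameter values and function name are illustrative.

```python
import math

def line_power(r, gain, alpha, T=1.0):
    """Transmitted power of the one-tap RC loop for a Markov input (assumed form of eq. 27)."""
    q = 1.0 / (1.0 + gain) ** 2                 # |H(0)|^2, the dc attenuation
    x = alpha * (1.0 + gain) * T
    g = (1.0 - math.exp(-x)) / x                # fraction of the random part that survives
    return r * r * q + (1.0 - r * r) * (q + (1.0 - q) * g)

r = 0.6
no_loop = line_power(r, gain=0.0, alpha=0.01)            # loop open: full unit power
slow_high_gain = line_power(r, gain=100.0, alpha=1e-4)   # approaches 1 - r**2 = 0.64
```

High gain removes the dc term, while a long time constant keeps the loop from tracking the random part of $m(t)$; both are needed to approach $1 - R^2$.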

Better results in both examples could be achieved by an improved
selection of the filter characteristic $W(\omega)$. We can see from the
equivalent circuit that the best choice of $W(\omega)$ makes $1/[1 + A W(\omega)]$
an efficient high pass filter with a transmission zero at $\omega = 0$. Of
course this must be compromised with any requirement on the filter
response time.

In this section we stress the use of the equivalent circuit as a 
method of analysis rather than as an implementable system. Clearly, 
if one were to build a one-tap binary predictor, the circuit of Figure 
5(b) would be preferred to that of the original system. However we 
believe that such a restricted system would not be of great practical 
interest. 

While the implementation of the simple equivalent circuit cannot 
be extended to wider application, it is hoped that the easy analysis 
of the simple system conveys some insight into the performance of 
multiloop systems. This would be particularly true if there were 
small interaction between taps on the multiloop system. Such a 
situation would occur if the covariance R (n) decreased rapidly with n. 

V. ERROR PROPAGATION 

When noise is added in the transmission channel there is some
probability of the received digits being incorrectly detected by the
slicer. Even though the transmitted power might have been substantially
reduced by the redundancy removal, the probability of an
initial error is identical to that of a full power system. Once an error
has been made, however, the probability of making subsequent errors
is increased because of the incorrect symbol being used in redundancy
restoration. Thus, errors tend to bunch together in the received data.
Besides increasing the average probability of error this error propagation
considerably complicates the problems of error control in the
entire system.

Error propagation in dc restoration circuits has been examined by
Zador, Aaron, and Simon [16,17]. It appears to be a very complicated
problem, in general, which is even more confused by the presence of
the adaptive, pattern sensitive filters in the redundancy removal
system we are considering here. Therefore, we shall attempt the
analysis of only the simplest meaningful theoretical model. Both
transmitter and receiver will have one-tap transversal filters as shown
in Figure 4. The input data is taken to be a binary first order Markov
process, with zero mean and covariance

$$R(n) = R^{|n|}.$$

The transition matrix for this process is:

$$\begin{array}{c|cc} & a_n = +1 & a_n = -1 \\ \hline a_{n-1} = +1 & \dfrac{1+R}{2} & \dfrac{1-R}{2} \\ a_{n-1} = -1 & \dfrac{1-R}{2} & \dfrac{1+R}{2} \end{array}$$

The ideal linear predictor for this time series is simply $\hat{a}_n = R a_{n-1}$
and the average transmitted power using this predictor is $1 - R^2$.
Since the ideal predictor uses only a single tap filter, the assumption
of single tap filters in the actual system is not particularly restrictive.
If additional taps were used, their gains would be small and their
effect on error propagation would not be significant.

We will assume that noise samples $\xi_k$, uncorrelated Gaussian
random variables with zero mean and variance $\sigma^2$, are added to the
transmitted symbols in the channel. We further assume that sufficient
smoothing is done at the transmitter so that the tap gain may
be fixed at its optimum value, $R$. Thus the transmitted samples are

$$e_k = a_k - R a_{k-1}. \tag{28}$$

Now at the receiver we shall write the received symbols as $\beta_k a_k$. The
parameter $\beta_k = \pm 1$ indicates the absence ($+1$) or the presence ($-1$)
of an error at time $t_k$. If the tap gain at the receiver is denoted by
the parameter $c$, the detected symbols can be written

$$\beta_k a_k = \operatorname{sgn}\left[a_k - a_{k-1}(R - c\beta_{k-1}) + \xi_k\right]. \tag{29}$$

Thus the error parameter $\beta_k$ is

$$\beta_k = \operatorname{sgn}\left[1 - a_k a_{k-1}(R - c\beta_{k-1}) + \eta_k\right] \tag{30}$$

where $\eta_k = \xi_k a_k$ has the same statistical properties as $\xi_k$. The
probability of error at time $t_k$ is the probability that $\beta_k = -1$, which is
the probability that $\eta_k$ is such that the term in brackets is negative.
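
The model just described is easy to simulate. The Python sketch below draws a Markov symbol stream, transmits $a_k - R a_{k-1}$, adds Gaussian noise, and slices after restoring the prediction from the receiver's own previous decision. The receiver tap gain is held fixed at `c` (no adaptation), and the values $R = 0.4$, $\sigma = 0.4$, the run length, and the seed are illustrative assumptions.

```python
import math
import random

def simulate(r, sigma, c, n, rng):
    """Fraction of decision errors when the receiver feeds back its own decisions."""
    a_prev, det_prev = 1, 1          # assumed error-free starting state
    errors = 0
    for _ in range(n):
        a = a_prev if rng.random() < (1 + r) / 2 else -a_prev
        e = a - r * a_prev                                  # transmitted sample
        y = e + rng.gauss(0.0, sigma) + c * det_prev        # add noise, then restore prediction
        det = 1 if y >= 0 else -1
        errors += det != a
        a_prev, det_prev = a, det
    return errors / n

r, sigma = 0.4, 0.4
rate = simulate(r, sigma, c=r, n=200_000, rng=random.Random(2))
baseline = 0.5 * math.erfc(1 / (sigma * math.sqrt(2)))     # Gaussian tail beyond 1/sigma
```

Here `baseline` is the error probability of plain binary transmission at the same noise level; the simulated `rate` comes out somewhat higher because each error raises the chance of another on the next symbol.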

Now we must turn our attention to the behavior of the receiver tap
gain $c$. If no errors are made, then this gain is identical to the
transmitter gain and as $k \to \infty$, $c \to R$. However, because of the presence
of errors, the receiver tap gain tends to be different from the
transmitter tap gain. At time $t_k$ the output voltage of the multiplier at
the receiver is

$$v_k = \beta_k a_k \beta_{k-1} a_{k-1} - c. \tag{31}$$

The random variables $v_k$ are averaged to determine the movement of
$c$. Notice that, since $|\beta_k a_k \beta_{k-1} a_{k-1}| = 1$, the magnitude of $c$ cannot
exceed unity except as a transient starting state. This eliminates any
possibility of a runaway in $c$ resulting from unusual error patterns.
We assume that the action of the loop at the receiver is to reduce
to zero the expectation of the multiplier output voltage at time
infinity. Thus

$$E[v_\infty] = 0 = \lim_{k \to \infty} E[\beta_k a_k \beta_{k-1} a_{k-1}] - c_\infty. \tag{32}$$

This type of final behavior would be exhibited by systems in which
$W(\omega)$ consisted of a long term averaging followed by an integration.
The expectation of the term in brackets in equation (32) depends on
$c_\infty$ itself, so in general we end with a fairly complicated equation requiring
a trial and error solution for $c_\infty$. By taking the limit as $k \to \infty$ of the
expectation we eliminate the dependence on time and on the initial
probability distributions for the random variables involved.

Define a vector random variable $\boldsymbol{\alpha}_k = (a_k, \beta_k)$ taking on the four
possible states $(+1, +1)$, $(+1, -1)$, $(-1, +1)$ and $(-1, -1)$, denoted
by states 1 through 4, respectively. Because $a_k$ is Markov and since the
expression for $\beta_k$ in equation (30) involves only $a_k$, $a_{k-1}$, $\beta_{k-1}$, and
$\eta_k$, we conclude that $\boldsymbol{\alpha}_k$ is also Markov. The four-by-four transition
matrix $\pi$ for $\boldsymbol{\alpha}_k$ has entries $p_{ij}$ which may be calculated from the original
transition matrix for the input symbols $a_k$ and from equation (30) for
the probabilities of error in various states. Table I lists these transition
probabilities. If the 4-entry row vector $w^{(k)}$ gives the probabilities of
$\boldsymbol{\alpha}_k$ assuming each of the four possible states, then

$$w^{(k)} = w^{(k-1)} \pi. \tag{33}$$

In terms of the initial state distribution $w^{(0)}$

$$w^{(n)} = w^{(0)} \pi^n. \tag{34}$$

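Table I. Transition Probabilities for $\boldsymbol{\alpha}_k = (a_k, \beta_k)$

$$\begin{aligned}
p_{11} = p_{33} &= \frac{1+R}{2}\left[1 - Q\left(\frac{1-R+c}{\sigma}\right)\right] \\
p_{12} = p_{34} &= \frac{1+R}{2}\, Q\left(\frac{1-R+c}{\sigma}\right) \\
p_{13} = p_{31} &= \frac{1-R}{2}\left[1 - Q\left(\frac{1+R-c}{\sigma}\right)\right] \\
p_{14} = p_{32} &= \frac{1-R}{2}\, Q\left(\frac{1+R-c}{\sigma}\right) \\
p_{21} = p_{43} &= \frac{1+R}{2}\left[1 - Q\left(\frac{1-R-c}{\sigma}\right)\right] \\
p_{22} = p_{44} &= \frac{1+R}{2}\, Q\left(\frac{1-R-c}{\sigma}\right) \\
p_{23} = p_{41} &= \frac{1-R}{2}\left[1 - Q\left(\frac{1+R+c}{\sigma}\right)\right] \\
p_{24} = p_{42} &= \frac{1-R}{2}\, Q\left(\frac{1+R+c}{\sigma}\right)
\end{aligned}$$

Here $Q(x)$ is the probability that a zero-mean, unit-variance Gaussian
variable exceeds $x$.

For $|R| < 1$ it is clear from standard Markov chain theory (see,
for example, Reference 18) that steady-state probabilities exist for
the transition matrix $\pi$, that is, $w^{(n)}$ approaches a constant vector $w$ as
$n \to \infty$ independent of $w^{(0)}$. The steady-state probabilities of the four
possible states can be obtained by the solution of the equations given by

$$w \pi = w. \tag{35}$$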

Some algebraic manipulation yields the probabilities

$$w_1 = P(a_\infty = +1, \beta_\infty = +1) = \frac{1}{2} \cdot \frac{1 - p_{22} - p_{24}}{1 - p_{22} + p_{12} - p_{24} + p_{14}} \tag{36}$$

$$w_2 = P(a_\infty = +1, \beta_\infty = -1) = \tfrac{1}{2} - w_1 \tag{37}$$

$$w_3 = P(a_\infty = -1, \beta_\infty = +1) = w_1 \tag{38}$$

$$w_4 = P(a_\infty = -1, \beta_\infty = -1) = \tfrac{1}{2} - w_1, \tag{39}$$

where the transition probabilities $p_{12}$, $p_{14}$, $p_{22}$, and $p_{24}$ are given in
Table I as functions of $c$, $R$, and $\sigma$.

The expected value of the multiplier output at time infinity can
now be written in terms of the steady-state probabilities $w_i$ and the
transition probabilities $p_{ij}$.

$$\begin{aligned} E[v_\infty] = {} & w_1[p_{11} - p_{12} - p_{13} + p_{14}] + w_2[p_{22} + p_{23} - p_{21} - p_{24}] \\ & + w_3[p_{32} + p_{33} - p_{31} - p_{34}] + w_4[p_{41} + p_{44} - p_{42} - p_{43}] - c. \end{aligned} \tag{40}$$

Again some algebraic manipulation yields the result

$$E[v_\infty] = \frac{R[1 - p_{14} - p_{24} - p_{22} - p_{12}] + 2[p_{14} - p_{12}] + 4[p_{22} p_{12} - p_{24} p_{14}]}{1 - p_{22} + p_{12} - p_{24} + p_{14}} - c. \tag{41}$$

The value of the tap gain at time infinity can be found by trial and
error. A value of $c$ is assumed, the transition probabilities are computed
and $E[v_\infty]$ is found. The value of $c$ for which $E[v_\infty] = 0$ is $c_\infty$. Notice that
under suitable assumptions $E[v_\infty]$ gives the rate of change of the
coefficient $c$ in the dynamic action of the system.

The probability of error after the system has settled is simply the
probability that $\boldsymbol{\alpha}_\infty$ is in a state where $\beta_\infty = -1$, which is simply
$(w_2 + w_4)$.

$$P_e = \frac{p_{12} + p_{14}}{1 - p_{22} + p_{12} - p_{24} + p_{14}} \tag{42}$$

The transition probabilities here must be computed using $c_\infty$.
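
The trial-and-error search can be mechanized. The Python sketch below writes the four error-transition probabilities directly from equation (30), evaluates the closed forms for $E[v_\infty]$ and the settled error probability, and locates the settling gain by bisection. The function names, the bracket $[0, 1]$ (which assumes a single sign change of $E[v_\infty]$), and the operating point $R = 0.4$, $\sigma = 0.4$ are illustrative assumptions.

```python
import math

def q(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def error_transitions(r, sigma, c):
    """Probabilities of moving into an error state: p12, p14, p22, p24."""
    p12 = (1 + r) / 2 * q((1 - r + c) / sigma)
    p14 = (1 - r) / 2 * q((1 + r - c) / sigma)
    p22 = (1 + r) / 2 * q((1 - r - c) / sigma)
    p24 = (1 - r) / 2 * q((1 + r + c) / sigma)
    return p12, p14, p22, p24

def expected_v(r, sigma, c):
    """Expected multiplier output at time infinity for an assumed tap gain c."""
    p12, p14, p22, p24 = error_transitions(r, sigma, c)
    num = r * (1 - p14 - p24 - p22 - p12) + 2 * (p14 - p12) + 4 * (p22 * p12 - p24 * p14)
    return num / (1 - p22 + p12 - p24 + p14) - c

def error_probability(r, sigma, c):
    """Steady-state probability that the error parameter equals -1."""
    p12, p14, p22, p24 = error_transitions(r, sigma, c)
    return (p12 + p14) / (1 - p22 + p12 - p24 + p14)

def settle(r, sigma, lo=0.0, hi=1.0, iters=60):
    """Bisection for the tap gain at which the expected multiplier output is zero."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_v(r, sigma, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r, sigma = 0.4, 0.4
c_inf = settle(r, sigma)                          # settling gain, slightly below r here
pe_at_r = error_probability(r, sigma, r)          # error probability if c were held at r
pe_settled = error_probability(r, sigma, c_inf)   # error probability at the settled gain
```

At this operating point the settling gain falls a little below $R$ and the two error probabilities differ only slightly.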

Expressions (41) and (42) have been written in terms of only
those transition probabilities which involve errors. Thus, as $\sigma \to 0$,
each of the transition probabilities in (41) and (42) approaches zero,
$c_\infty \to R$, and $P_e \to 0$. Each of these probabilities can be visualized
as the probability that the noise (zero mean, variance $\sigma^2$) is greater
than one of these four thresholds:

$$\begin{array}{lll}
p_{12}: & 1 - R + c & \left(\text{multiply by } \tfrac{1+R}{2}\right) \\
p_{14}: & 1 + R - c & \left(\text{multiply by } \tfrac{1-R}{2}\right) \\
p_{22}: & 1 - R - c & \left(\text{multiply by } \tfrac{1+R}{2}\right) \\
p_{24}: & 1 + R + c & \left(\text{multiply by } \tfrac{1-R}{2}\right)
\end{array}$$

Thus $p_{24}$ is the smallest transition probability, while $p_{22}$ is the largest.
If the transition probabilities are small, it can be seen from equation
(42) that $P_e$ is principally determined by $(p_{12} + p_{14})$, which is
minimized by $c = R$. Also we notice from equation (41) that the tap
gain $c$ approaches $R$ very closely for small transition probabilities.
In general, however, $c = R$ will not be the best setting to minimize
the error probability in equation (42), nor is it the setting to which
the loop settles. Unfortunately it appears that these are not compensating
offsets. For example, in Figure 6 we have plotted $P_e$ and $E[v_\infty]$
against $c$, for a case in which $R = 0.4$ and $\sigma = 0.4$. Although neither
effect is very significant, it can be seen that the system settles
($E[v_\infty] = 0$) for a value of $c$ somewhat smaller than $R$, while the minimum
error probability is obtained at a value of $c$ somewhat larger than $R$.

Fig. 6. Probability of error and $E[v_\infty]$ vs receiver tap gain $c$.

In all but the most severe noise conditions the approximation of
$c_\infty = R$ would be satisfactory and we would have



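$$P_e = \frac{Q(1/\sigma)}{1 + Q(1/\sigma) - \frac{1+R}{2}\,Q\!\left(\frac{1-2R}{\sigma}\right) - \frac{1-R}{2}\,Q\!\left(\frac{1+2R}{\sigma}\right)}. \tag{43}$$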

But $Q(1/\sigma)$ is the probability of error in the original system (no 
redundancy removal). If this probability, called $P_{e0}$, is small, then 
$Q((1 + 2R)/\sigma)$ is much smaller and we have the very good approximation 

$$P_e \Big|_{P_{e0}\ \text{small}} \approx \frac{P_{e0}}{1 - \frac{1+R}{2}\,Q\!\left(\frac{1-2R}{\sigma}\right)}. \tag{44}$$



The factor in the denominator gives the amplification of the original 
error rate due to error propagation. Finally if $R > 1/2$, then 
$Q((1 - 2R)/\sigma)$ approaches unity and we get the severe dependence upon $R$ 

$$P_e \approx \frac{2P_{e0}}{1 - R}. \tag{45}$$
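A short numerical check shows how much the error rate is amplified. This is a sketch under stated assumptions: it takes (43) to be (42) evaluated at $c = R$ with noise thresholds $1$, $1 - 2R$ and $1 + 2R$, and takes (44) and (45) to be its small-noise and $R > 1/2$ limits. The function names and the operating point ($R = 0.77$, $\sigma = 0.25$) are chosen here for illustration only.

```python
import math


def q(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pe_original(sigma):
    """P_e0 = Q(1/sigma): error rate with no redundancy removal."""
    return q(1.0 / sigma)


def pe_at_c_equal_r(R, sigma):
    """Assumed form of (43): equation (42) evaluated at c = R."""
    pe0 = pe_original(sigma)
    den = (1.0 + pe0
           - 0.5 * (1.0 + R) * q((1.0 - 2.0 * R) / sigma)
           - 0.5 * (1.0 - R) * q((1.0 + 2.0 * R) / sigma))
    return pe0 / den


def pe_small_noise(R, sigma):
    """Assumed form of (44): only the Q((1 - 2R)/sigma) term is kept."""
    den = 1.0 - 0.5 * (1.0 + R) * q((1.0 - 2.0 * R) / sigma)
    return pe_original(sigma) / den


def pe_high_redundancy(R, sigma):
    """Assumed form of (45): Q((1 - 2R)/sigma) replaced by unity."""
    return 2.0 * pe_original(sigma) / (1.0 - R)


R, sigma = 0.77, 0.25
pe0 = pe_original(sigma)
for label, f in (("(43)", pe_at_c_equal_r),
                 ("(44)", pe_small_noise),
                 ("(45)", pe_high_redundancy)):
    pe = f(R, sigma)
    print(label, "P_e =", pe, "amplification =", pe / pe0)
```

At this operating point the three expressions agree to within roughly ten percent, and the limiting factor $2/(1 - R)$ is about 8.7.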

The most significant aspect of the error propagation behavior of 
the circuit is that the redundancy removal and restoration system 
has impressed the statistics of the input data (Markov here) upon 
the error statistics of the output. It is clear that this philosophy 
would hold in general. In the case of highly correlated input we would 
end with highly correlated errors. The problems of error control could 
be made quite severe in this manner. 

VI. EXPERIMENTAL RESULTS 

A three-tap, adaptive transmitter and a similar receiver were de- 
signed and constructed by V. G. Koll. The system was designed for 
binary data transmission so that the multipliers in Figure 3 became 
polarity switches, while the delay line took the form of a shift register. 
The filters $W(s)$ consisted of simple RC low pass sections followed 
by integrators, that is, 

$$W(s) = \frac{1}{s(s + \alpha)} \tag{46}$$

With this choice of smoothing, the steady-state error for a periodic 
input (period 3 or less here) was zero. It was in fact observed that 
during the transmission of periodic data the transmitter could be 
disconnected with no effect on the received data pattern. 
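The behavior of such a three-tap adaptive predictor on binary data can be imitated in software. The sketch below is a discrete-time analogue under loud assumptions: it replaces the analog smoothing of (46) with a plain stochastic-gradient update $c_k \leftarrow c_k + \Delta\, e_n x_{n-k}$, and it drives the predictor with a first-order Markov source rather than the experimental source. The names `markov_bits` and `adapt_predictor` and the step size are hypothetical.

```python
import random


def markov_bits(n, R, rng):
    """Binary (+1/-1) first-order Markov sequence with E[x_n x_{n-1}] = R."""
    x, out = 1, []
    for _ in range(n):
        if rng.random() >= (1.0 + R) / 2.0:   # flip with probability (1 - R)/2
            x = -x
        out.append(x)
    return out


def adapt_predictor(data, taps=3, step=0.002):
    """Adapt a linear predictor; return final taps and mean-square error.

    ASSUMPTION: the update rule is a generic stochastic-gradient rule,
    not a model of the hardware loop.
    """
    c = [0.0] * taps
    history = [0] * taps            # shift register: x_{n-1}, ..., x_{n-taps}
    power, count = 0.0, 0
    tail = len(data) // 2           # measure power after the taps have settled
    for n, x in enumerate(data):
        # With +/-1 data each product is only a polarity switch on c_k.
        prediction = sum(ck * xk for ck, xk in zip(c, history))
        e = x - prediction          # transmitted prediction error
        for k in range(taps):
            c[k] += step * e * history[k]
        history = [x] + history[:-1]
        if n >= tail:
            power += e * e
            count += 1
    return c, power / count


rng = random.Random(1)
taps, power = adapt_predictor(markov_bits(40000, 0.6, rng))
print("taps:", taps, "error power:", power)
```

For a Markov source with $R = 0.6$ the first tap should drift toward $R$, the other two toward zero, and the error power toward $1 - R^2 = 0.64$.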

The input data for the system was obtained by passing white Gaus- 
sian noise through a variable cutoff, low pass filter. If we assume an 
ideal low pass filter, with cutoff frequency W Hz, then the autocor- 
relation function of the filter output is 

$$R_y(\tau) = R_y(0)\left[\frac{\sin 2\pi W\tau}{2\pi W\tau}\right]. \tag{47}$$

This voltage is then sampled at rate $1/T$ and subjected to infinite 
clipping so as to produce the correlated input bits. Van Vleck and 
Middleton 19 show that the resulting autocorrelation is 

$$R(n) = \frac{2}{\pi}\,\sin^{-1}\!\left[\frac{\sin 2\pi n W T}{2\pi n W T}\right]. \tag{48}$$

For a filter cutoff of $1/(2T)$ Hz the data is uncorrelated. By decreasing 
the filter cutoff frequency the redundancy in the data can be increased. 
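Equation (48) is easy to evaluate numerically. The sketch below (the function name is hypothetical) computes $R(n)$ for a normalized cutoff $WT$ and prints $R(1)$ for three cutoffs.

```python
import math


def clipped_autocorrelation(n, WT):
    """R(n) from equation (48) for normalized filter cutoff WT = W * T."""
    if n == 0:
        return 1.0                      # limit of the bracket as n -> 0
    arg = 2.0 * math.pi * n * WT
    return (2.0 / math.pi) * math.asin(math.sin(arg) / arg)


for WT in (0.5, 0.4, 0.1):
    print("WT =", WT, "R(1) =", round(clipped_autocorrelation(1, WT), 2))
```

At $WT = 0.5$ the adjacent-bit correlation vanishes. At $WT = 0.4$ and $WT = 0.1$ it reproduces the values $R(1) = 0.15$ and $R(1) = 0.77$ quoted in the caption of Figure 7.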

The action of the adaptive redundancy remover is shown in Figure 7 
for two different values of filter cutoff. Notice that as the redundancy 
is increased the transmitted waveform has longer periods of near zero 
voltage where predictability is good and occasional peaks where the 
predictor is "surprised." Except for a few minor discontinuities the 
reconstructed signal before slicing at the receiver is the same as the 
original input waveform at the transmitter. The relative power saving 
as a function of filter cutoff is shown in Figure 8. 

Fig. 7 — Transmitted and reconstructed signals. (a) Filter cutoff $WT = 0.4$ 
[little redundancy, $R(1) = 0.15$]. (b) Filter cutoff $WT = 0.1$ [moderate 
redundancy, $R(1) = 0.77$]. 

In order to predict system performance in Gaussian noise we make 
the crude approximation that the input process is Markov with $R(1)$ 
as given in equation (48). According to this approximation the transmitted 
power should be $1 - R(1)^2$. This value is also shown in Figure 8 
in comparison with the actual measured power output. Since the 
exact correlation function is known, the theoretical signal power output 
could be computed precisely through equation (4). However, we have 
no corresponding means of computing the degree of error propagation 
for the non-Markov source. The approximate curve of signal power 
in Figure 8 is shown only as a way of evaluating the Markov approximation 
for later use in predicting error propagation values. 
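The Markov value of the transmitted power follows in one line if, as an assumption made here for illustration, the predictor is taken to be the single-tap predictor $\hat{x}_n = R(1)\,x_{n-1}$ acting on binary data $x_n = \pm 1$ with $E[x_n x_{n-1}] = R(1)$:

$$E[e_n^2] = E\big[(x_n - R(1)\,x_{n-1})^2\big] = 1 - 2R(1)^2 + R(1)^2 = 1 - R(1)^2.$$

For the two cutoffs of Figure 7 this gives $1 - 0.15^2 \approx 0.98$ and $1 - 0.77^2 \approx 0.41$ of the unit input power.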

Bandlimited white Gaussian noise was added to the transmitted 
signal, and error rates were experimentally determined by V. G. Koll 
at a number of filter cutoff (redundancy) positions. The results of 
these tests are shown in Figure 9 in curves of probability of error 
versus signal-to-noise ratio. Beside these measured curves have been 
plotted theoretically computed curves which are based on the Markov 
approximation and on the use of equation (43) for $P_e$. 

Although all necessary information for performance determination 
is contained in Figure 9, it is instructive to plot two additional curves 
of probability of error versus filter cutoff. These curves are shown 
in Figure 10. In one curve the transmitter and receiver gains are held 
constant so that the line power decreases according to the curve of 
Figure 8 while the probability of error increases with increasing re- 
dundancy because of the effects of error propagation. In the other 
curve of Figure 10 the transmitter and receiver gains have been 
adjusted with increasing redundancy so as to hold line power constant. 
In this case the probability of error decreases with increasing redun- 
dancy. 

Fig. 8 — Signal power saving by redundancy removal. 

Fig. 9 — Performance of redundancy removal system at various values of 
normalized filter cutoff $WT$. 



VII. CONCLUSION 

We have advanced two main points. First, we suggested the possibility 
of using an easily implemented adaptive predictor for data compression 
systems. Second, we investigated the use of this adaptive predictor 
in digital transmission. 

We have seen that the predictor can be used to increase transmission 
efficiency for redundant data either by decreasing signal power for a 
given error rate or by decreasing probability of error for a given signal 
power. Although the required circuitry for the digital application is 
quite simple, it is nearly impossible to make an economic evaluation 
of the system because of the complete lack of knowledge of the prev- 
alence and degree of redundancy in customer input data. 